Enhanced Server Fault-tolerance Techniques for Seamless User Experience
Abstract
User applications such as email, calendar, and maps are migrating from local desktop machines to data centers because of the many advantages this computing paradigm offers. This trend is driving a marked increase in the number of servers deployed at data centers. To ride the price/performance curves for CPU, memory, and other hardware, inexpensive commodity machines are the most cost-effective choice for a data center, despite their low availability. However, more frequent server failures cause service outages and degrade user experience, which in turn results in lost revenue for businesses. Emerging web applications also put additional demands on server fault-tolerance. For example, if a user is browsing a map service such as Google, Yahoo, or MSN maps, a server failure that causes an outage of more than a few seconds is noticeable to the user and hence degrades the user experience. In this thesis, I propose three novel techniques aimed at improving server fault-tolerance: (1) ST-TCP, an extension of TCP that tolerates server failures by using an active backup that replicates the state of the primary and seamlessly takes over a TCP connection when the primary server fails; (2) CRAFT, in which the TCP splicing mechanism is enhanced to make it both fault-tolerant and more scalable, forming the basis of a scalable and fault-tolerant web server architecture that specifically addresses server fault-tolerance issues for highly interactive or real-time applications; and (3) call-preserving failover, an efficient and scalable fault-tolerance mechanism for migrating IP telephony calls to an alternate call controller.
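The active-backup idea behind technique (1) can be illustrated with a toy sketch. This is not the ST-TCP implementation, which operates on real TCP connection state; it is only a minimal simulation, assuming that both replicas observe every client request so the backup can answer without the client noticing the switch. All class, method, and request names here are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical server replica holding per-connection state."""
    name: str
    received: list = field(default_factory=list)  # requests seen so far
    alive: bool = True

    def deliver(self, request: str) -> None:
        # Both replicas see every client request, so their state stays in sync.
        if self.alive:
            self.received.append(request)

    def respond(self) -> str:
        # The reply depends only on replicated state, so either replica can produce it.
        return f"ack {len(self.received)}: {self.received[-1]}"


class ReplicatedConnection:
    """Toy active-backup pair: the primary answers; the backup shadows it."""

    def __init__(self) -> None:
        self.primary = Replica("primary")
        self.backup = Replica("backup")
        self.active = self.primary

    def handle(self, request: str) -> str:
        self.primary.deliver(request)
        self.backup.deliver(request)
        if not self.active.alive:
            # Failover: the backup already holds the connection state.
            self.active = self.backup
        return self.active.respond()


conn = ReplicatedConnection()
first = conn.handle("GET /tile/1")
conn.primary.alive = False  # simulate a primary crash (assumed failure model)
second = conn.handle("GET /tile/2")
print(first)
print(second)
```

Because the backup has processed the same requests as the primary, the reply after the simulated crash continues the same numbering, which is the property a client-transparent takeover relies on. Failure detection, sequence-number handling, and output suppression on the backup are deliberately left out of this sketch.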
Similar resources
Stability Assessment Metamorphic Approach (SAMA) for Effective Scheduling based on Fault Tolerance in Computational Grid
Grid computing allows coordinated and controlled resource sharing and problem solving in multi-institutional, dynamic virtual organizations. Fault tolerance and task scheduling are important issues for a large-scale computational grid because of the unreliable nature of grid resources. A commonly exploited technique to realize fault tolerance is periodic checkpointing, which periodically ...
Enhanced Server Fault Tolerance for Improved User Experience; CU-CS-1037-08
Interactive applications such as email, calendar, and maps are migrating from local desktop machines to data centers due to the many advantages offered by such a computing environment. Furthermore, this trend is creating a marked increase in the deployment of servers at data centers. To ride the price/performance curves for CPU, memory and other hardware, inexpensive commodity machines are the ...
Client-Transparent Fault-Tolerant Web Service
Most of the existing fault tolerance schemes for Web servers detect server failure and route future client requests to backup servers. These techniques typically do not provide transparent handling of requests whose processing was in progress when the failure occurred. Thus, the system may fail to provide the user with confirmation for a requested transaction or clear indication that the transa...
Seamless Mobility with Personal Servers
We describe the concept and the taxonomy of personal servers, and their implications in seamless mobility. Personal servers could offer electronic services independently of network availability or quality, provide a greater flexibility in the choice of user access device, and support the key concept of continuous user experience. We describe the organization of mobile and remote personal server...
Fuxi: a Fault-Tolerant Resource Management and Job Scheduling System at Internet Scale
Scalability and fault-tolerance are two fundamental challenges for all distributed computing at Internet scale. Despite many recent advances from both academia and industry, these two problems are still far from settled. In this paper, we present Fuxi, a resource management and job scheduling system that is capable of handling the kind of workload at Alibaba where hundreds of terabytes of data ...